Large vocabulary Russian speech recognition using syntactico-statistical language modeling

Authors

Alexey Karpov

Konstantin Markov

Irina S. Kipyatkova

Daria Vazhenina

Andrey Ronzhin

Abstract

Speech is the most natural form of human communication, and convenient, efficient human–computer interaction requires state-of-the-art spoken language technology. Research in this area has traditionally focused on a few major languages, such as English, French, Spanish, Chinese and Japanese, while other languages, particularly Eastern European ones, have received much less attention. Recently, however, research on speech technologies for Czech, Polish, Serbo-Croatian and Russian has been steadily increasing. In this paper, we describe our efforts to build a large-vocabulary automatic speech recognition (ASR) system for Russian. Russian is a synthetic, highly inflected language with many roots and affixes, which greatly reduces the performance of ASR systems designed using traditional approaches. In our work, we paid special attention to the specifics of the Russian language when developing the acoustic, lexical and language models. A special software tool for creating the pronunciation lexicon was developed. For the acoustic model, we investigated a combination of knowledge-based and statistical approaches to create several different phoneme sets, the best of which was determined experimentally. For the language model (LM), we introduced a new method that combines syntactic and statistical analysis of the training text data in order to build better n-gram models. Evaluation experiments were performed using two different Russian speech databases and an internally collected text corpus. Among the phoneme sets we created, the set with 47 phonemes produced the fewest word-level recognition errors, so we used it in the subsequent language modeling evaluations. Experiments with an ASR system with a 204-thousand-word vocabulary were performed to compare standard statistical n-gram LMs with the language models created using our syntactico-statistical method. The results demonstrated that the proposed language modeling approach is capable of reducing word recognition errors. © 2013 Elsevier B.V. All rights reserved.
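The abstract compares systems by their word-level recognition errors. As a rough illustration only, the sketch below computes the conventional word error rate (substitutions, deletions and insertions divided by the number of reference words) with a Levenshtein alignment. The function name `word_error_rate` and the whitespace tokenization are assumptions made for this example; the abstract does not say which scoring tool or text normalization the authors used.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is plain
    whitespace splitting, which is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One inflection error in a three-word reference.
    print(word_error_rate("мама мыла раму", "мама мыла рама"))
```

Because a single wrong inflectional ending counts as a full word error, a metric of this kind is sensitive to exactly the morphological richness the abstract describes.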


Similar resources

Modeling of Pronunciation, Language and Nonverbal Units at Conversational Russian Speech Recognition

The main problems in developing a conversational Russian speech recognition system are pronunciation variability, free word order in sentences and the presence of speech disfluencies. In the paper, pronunciation variability is modeled by creating multiple word transcriptions. A syntactic-statistical language model that takes into account long-distance word dependencies is proposed for Russian ...


Speech Recognition of Czech-Inclusion of Rare Words Helps

Large vocabulary continuous speech recognition of inflective languages, such as Czech, Russian or Serbo-Croatian, is heavily degraded by an excessive out-of-vocabulary rate. In this paper, we tackle the problem of vocabulary selection, language modeling and pruning for inflective languages. We show that by explicitly reducing the out-of-vocabulary rate we can achieve significant improvements in ...


Rescoring n-best lists for Russian speech recognition using factored language models

In this paper, we present a study of a factored language model (FLM) for rescoring N-best lists in a Russian speech recognition task. As a baseline language model we used a 3-gram language model. Both the baseline and the factored language models were trained on a text corpus collected from recent news texts on the Internet sites of online newspapers; the total size of the corpus is about 350 million words (2.4...


Sequence memoizer based language model for Russian speech recognition

In this paper, we propose a novel language model for Russian large vocabulary speech recognition based on the sequence memoizer modeling technique. The sequence memoizer is a long-span text dependency model and was initially proposed for character language modeling. Here, we use it to build a word-level language model (LM) for ASR. We compare its performance with a recurrent neural network (RNN) LM, which a...


Very Large Vocabulary ASR for Spoken Russian with Syntactic and Morphemic Analysis

In this paper, we present a word-based very large vocabulary automatic speech recognition system for Russian. Some novel methods are proposed for organizing the lexicon and the language model. A two-level morpho-phonemic prefix graph that uses some information on the morphemic structure of lexical units is suggested for a compact representation of the pronunciation vocabulary and search space. S...



Journal:

Speech Communication

Volume 56


Publication date: 2014
